Update command to install compatible protobuf dependencies #3887

Open: lavakumarrepala wants to merge 4 commits into main from v-rlava/e2e-object-classification-distributed-pytorch

lavakumarrepala commented Apr 17, 2026

Description

Issue

Training jobs started failing during MLflow logging with:
AttributeError: 'google._upb._message.FieldDescriptor' object has no attribute 'label'
As a result:

  • MLflow metrics and params were not logged, and mlflow.search_runs() returned incomplete results.
  • Downstream notebook analysis failed due to missing columns.

Root Cause
The curated environment (AzureML-acpt-pytorch-2.8-cuda12.6@latest) picked up a newer protobuf version (v5+), which is incompatible with MLflow’s proto handling.
MLflow relies on legacy protobuf APIs (e.g., FieldDescriptor.label) that are no longer available in newer protobuf implementations.
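
To confirm this diagnosis inside the job environment, a small probe like the one below can report the installed protobuf version and whether a field descriptor still exposes `label`. This is a minimal sketch, not part of the PR: the helper name `protobuf_label_status` is hypothetical, and it assumes the `Timestamp` well-known type that ships with protobuf is a fair stand-in for the descriptors MLflow touches.

```python
from importlib import metadata


def protobuf_label_status():
    """Return (installed_version, field_has_label).

    Returns (None, None) when protobuf is not installed.
    Hypothetical helper for illustration only.
    """
    try:
        version = metadata.version("protobuf")
    except metadata.PackageNotFoundError:
        return None, None

    # Timestamp is a well-known type bundled with protobuf itself,
    # so no generated project code is needed for the probe.
    from google.protobuf import timestamp_pb2

    field = timestamp_pb2.Timestamp.DESCRIPTOR.fields[0]
    return version, hasattr(field, "label")


if __name__ == "__main__":
    version, has_label = protobuf_label_status()
    print(f"protobuf={version} FieldDescriptor.label available={has_label}")
```

Running it once before and once after the pin should show whether the attribute is what actually changed between the two protobuf versions.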

Fix
Pin protobuf to a compatible version at runtime:
python -m pip install --no-cache-dir --no-deps "protobuf==3.20.3"
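
Since the pin runs at job start, one way to keep it next to the training entry point is to build the job command string in one place. The sketch below is an assumption about how that could look: `build_job_command` and the `train.py` arguments are hypothetical, and only the pip invocation comes from this PR.

```python
import shlex

# Version taken from the fix in this PR.
PROTOBUF_PIN = "protobuf==3.20.3"


def build_job_command(train_args, pin=PROTOBUF_PIN):
    """Join the protobuf pin and the training call into one shell command.

    `train_args` is the argument list passed to `python`, for example
    ["train.py", "--epochs", "1"] (a hypothetical script name).
    """
    install = ["python", "-m", "pip", "install", "--no-cache-dir", "--no-deps", pin]
    train = ["python", *train_args]
    # `&&` makes the training step run only if the install succeeded.
    return f"{shlex.join(install)} && {shlex.join(train)}"


if __name__ == "__main__":
    print(build_job_command(["train.py", "--epochs", "1"]))
```

The resulting string would then go into the job's command field. Note that `--no-deps` skips dependency resolution, so pip will not reinstall or adjust other packages that depend on protobuf.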

Checklist

  • I have read the contribution guidelines.
  • I have coordinated with the docs team (mldocs@microsoft.com) if this PR deletes files or changes any file names or file extensions.
  • Pull request includes test coverage for the included changes.
  • This notebook or file is added to the CODEOWNERS file, pointing to the author or the author's team.

lavakumarrepala changed the title from "Update command to install dependencies and run training" to "Update command to install protobuf dependencies" on Apr 22, 2026
lavakumarrepala changed the title from "Update command to install protobuf dependencies" to "Update command to install compatible protobuf dependencies" on Apr 22, 2026

@kingernupur kingernupur left a comment

@lavakumarrepala Thanks for the detailed description and analysis, which are very helpful for understanding the issue and the fix.

While the fix you are proposing works, I feel there is a gap in our understanding of the cause.

"MLflow relies on legacy protobuf APIs (e.g., FieldDescriptor.label) that are no longer available in newer protobuf implementations."
My takeaway from the statement above is that MLflow itself isn't compatible with the latest protobuf versions. In that case, I would expect MLflow to declare this constraint in its own package requirements, which doesn't seem to be happening given the errors we are observing.
@jayesh-tanna Can you confirm this?
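
One way to check this in the curated environment is to read the requirements that the installed distributions declare, using the standard-library `importlib.metadata`. The sketch below is only illustrative: the helper names are hypothetical, and checking `mlflow-skinny` alongside `mlflow` is an assumption about where the protobuf bound might be declared.

```python
import re
from importlib import metadata


def _requirement_name(requirement):
    """Extract and normalize the project name from a requirement string."""
    match = re.match(r"[A-Za-z0-9._-]+", requirement.strip())
    return re.sub(r"[-_.]+", "-", match.group(0)).lower() if match else ""


def filter_requirements(requirements, dependency):
    """Keep only the requirement strings that refer to `dependency`."""
    wanted = re.sub(r"[-_.]+", "-", dependency).lower()
    return [r for r in requirements if _requirement_name(r) == wanted]


def declared_requirements(distribution, dependency):
    """Return what `distribution` declares for `dependency`.

    Returns an empty list if the distribution is not installed or
    declares no matching requirement.
    """
    try:
        requirements = metadata.requires(distribution) or []
    except metadata.PackageNotFoundError:
        return []
    return filter_requirements(requirements, dependency)


if __name__ == "__main__":
    # Checking both names is an assumption, not a confirmed detail of this environment.
    for dist in ("mlflow", "mlflow-skinny"):
        print(dist, declared_requirements(dist, "protobuf"))
```

If a bound on protobuf is printed and the installed protobuf version falls outside it, the environment resolved an unsupported combination. If the installed version is inside the declared bound, the constraint itself is too loose.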
